Executing Parallel Programs with Synchronization Bottlenecks Efficiently

نویسندگان

  • Y. OYAMA
  • TAURA A. YONEZAWA
چکیده

We propose a scheme within which parallel programs with potential synchronization bottlenecks run e ciently. In the straightforward implementations which use basic locking schemes, the execution time for the program parts with bottlenecks increases signi cantly when the number of processors increases. Our scheme makes the parallel performance for the bottleneck parts of programs close to the sequential performance while maintaining the e ciency with which the nonbottleneck parts run. Experiments with a 64-processor SMP and a 128-processor DSM machine conrmed that parallel programs implemented with our scheme perform much better than parallel programs implemented with other widely-used locking schemes.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Message Passing and Threads

In this chapter we examine two fundamental, although low-level, approaches to expressing parallelism in programs. Over the years, numerous different approaches to designing and implementing parallel programs have been developed (e.g., see the excellent survey article by Skillicorn and Talia [870]). However, over time, two dominant alternatives have emerged: message passing and multithreading. T...

متن کامل

Executing Communication-Intensive Irregular Programs Efficiently

We consider the problem of eÆciently executing completely irregular, communication-intensive parallel programs. Completely irregular programs are those whose number of parallel threads as well as the amount of computation performed in each thread vary during execution. Our programs run on MIMD computers with some form of space-slicing (partitioning) and time-slicing (scheduling) support. A hard...

متن کامل

Architectural and Software Support for Executing Numerical Applications on High Performance Computers By

Numerical applications require large amounts of computing power. Although shared memory multiprocessors provide a cost-e ective platform for parallel execution of numerical programs, parallel processing has not delivered the expected performance on these machines. There are two crucial steps in parallel execution of numerical applications: (1) e ective parallelization of an application and (2) ...

متن کامل

Multigranular Thread Support in WaveScalar

WaveScalar is a recently proposed scalable microarchitecture. The original WaveScalar research developed and evaluated an ISA and microarchitecture that efficiently executes a single, coarse-grain thread. In this paper, we expand that design to support multiple, simultaneously executing threads. Four mechanisms make this possible: (1) instructions that enable and disable wave-ordered memory; (2...

متن کامل

Survey of optimizing techniques for parallel programs running on computer clusters

In the current field of high performance computing, clusters technologies plays an ever increasing role. This paper tries to summarize state-of-the techniques for optimization of parallel programs designed to run on computer clusters. Optimizing parallel programs is a much harder task than optimizing sequential programs due to the increased complexity caused be communication and synchronization...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 1999